Merged
Upload SSE now streams structured per-phase JSON events (queued, load,
chunk, embed, extract_entities, extract_relationships, extract_claims,
structure, finalize, done/error) instead of 1-4 plaintext messages over
an entire job. The pipeline emits non-blocking ProgressEvents at every
phase boundary; the upload handler relays them on a per-job channel,
JSON-encoded over `data: {...}\n\n` frames. The React modal subscribes
via a shared store-backed hook (parseSseChunk + useSyncExternalStore),
renders a per-file row with the current phase plus chunk counters
during embed, and inline errors when the pipeline reports a failure.
Verified end-to-end against ollama/bge-m3 + llama3.2 — 11 distinct
phase events observed for a single-file upload, terminating on
done:true. Existing job-id-filtering regression test extended to
assert the new JSON wire format.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
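
The store-backed hook mentioned in the commit message can be pictured with a minimal sketch of an external store. This is not the code from this PR: `createProgressStore`, `FileProgress`, and the field names are hypothetical, chosen for illustration. The only contract relied on is React's `useSyncExternalStore(subscribe, getSnapshot)`, which expects `subscribe` to return an unsubscribe function and `getSnapshot` to return the same reference until the data changes.

```typescript
type Listener = () => void;

// Hypothetical per-file state; the real hook may track different fields.
interface FileProgress {
  phase: string;
  chunksDone: number;
  chunksTotal: number;
  error?: string;
}

type Snapshot = Readonly<Record<string, FileProgress>>;

// Hypothetical factory for a store shaped to fit useSyncExternalStore.
function createProgressStore() {
  let snapshot: Snapshot = {};
  const listeners = new Set<Listener>();

  return {
    // Registers a listener and returns the matching unsubscribe function.
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    // Returns the cached snapshot; the reference only changes in update().
    getSnapshot(): Snapshot {
      return snapshot;
    },
    // Replaces the snapshot immutably, then notifies every subscriber.
    update(file: string, progress: FileProgress): void {
      snapshot = { ...snapshot, [file]: progress };
      listeners.forEach((notify) => notify());
    },
  };
}
```

Inside a component, such a store would typically be read with `useSyncExternalStore(store.subscribe, store.getSnapshot)`. Both functions close over their state rather than using `this`, so they can be passed detached.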
Summary
When a user uploads a file via the UI, today they see ~4 plaintext messages over the whole indexing job and silent failures on LLM 5xx during extraction. This PR replaces the upload SSE channel with structured per-phase events.
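
As a rough illustration of what a structured per-phase event looks like on the wire, here is a minimal sketch. The field names come from the JSON shape listed in this PR; the field types, which fields are optional, and the helpers `encodeFrame` and `rowLabel` are assumptions made for the example, not the PR's code.

```typescript
// Field names follow the wire shape described in this PR.
// Types and optionality are assumptions of this sketch.
interface UploadProgressEvent {
  job_id: string;
  file?: string;
  phase: string;
  chunks_done?: number;
  chunks_total?: number;
  message?: string;
  done?: boolean;
  error?: string;
}

// Hypothetical helper: wrap one event in an SSE `data:` frame.
// JSON.stringify escapes newlines, so the payload stays on one line.
function encodeFrame(ev: UploadProgressEvent): string {
  return `data: ${JSON.stringify(ev)}\n\n`;
}

// Hypothetical helper: text for a per-file row, showing chunk counters
// only while the embed phase is running and a total is known.
function rowLabel(ev: UploadProgressEvent): string {
  if (ev.phase === "embed" && ev.chunks_total !== undefined && ev.chunks_total > 0) {
    return `embed ${ev.chunks_done ?? 0}/${ev.chunks_total} chunks`;
  }
  return ev.phase;
}
```

Keeping one JSON object per frame means the client can decide how to render each phase without parsing free-form message text.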
- `internal/pipeline/pipeline.go`: extends `ProgressEvent` with `File`, `ChunksDone`, `ChunksTotal`. `indexFile` now emits at every phase boundary: load → chunk → embed (with chunk counters) → extract_entities → extract_relationships → extract_claims → structure. Send is non-blocking so a slow consumer never stalls indexing. Existing CLI callers pass a nil `Progress` and remain unaffected.
- `internal/api/handlers.go`: per-job structured event log (`jobProgressLog` with `sync.Cond`). The upload goroutine creates a buffered `chan pipeline.ProgressEvent`, passes it to `IndexOptions`, and a relay goroutine fans events into the log. The SSE handler at `GET /api/upload/progress?job_id=...` now writes `data: {json}\n\n` frames where `{json}` is `{job_id, file, phase, chunks_done, chunks_total, message, done, error}`. Indexing failures emit a terminal `phase: "error"` event. Legacy plain-text relay preserved when `?job_id=` is omitted. Route path unchanged.
- `ui/src/hooks/api/useUploadProgress.ts`: fetch + ReadableStream-based SSE consumer with a shared store accessed via `useSyncExternalStore`. Robust frame parser (`parseSseChunk`) handles split frames, comment/retry/id lines, and falls back gracefully on plain-text legacy frames.
- `ui/src/routes/documents/UploadModal.tsx`: per-file row per phase (Linear-style restraint, semantic shadcn classes, `aria-live="polite"`), `embed N/M chunks` counter, inline `text-destructive` for failures.

End-to-end verified against the agent's ollama/bge-m3 + llama3.2 setup: 11 distinct phases observed for a single-file upload, terminating on `done:true`. Production server on `:37777` was not touched.

Test plan
- `make build` clean
- `make vet` clean
- `make test`: all packages green
- `cd ui && npm run build`: TypeScript + Vite clean
- `cd ui && npm test -- --run`: 102/102 tests pass (incl. new `useUploadProgress.test.tsx`)
- `TestUploadProgress_JobIDFiltering` regression test extended; new `TestUploadProgress_StructuredEventFormat` asserts the JSON wire shape and ≥6 distinct phases
- `:37788` → POST upload → SSE stream): 11 distinct phases, `done:true` received, server cleanly stopped

Caveats
`parseSseChunk` (frame parsing) but not the full streaming hook against a mock `ReadableStream`; deferred since the parser is the load-bearing part and the live smoke covers wiring.

🤖 Generated with Claude Code
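
Since the caveat leans on frame parsing being the load-bearing part, here is a minimal sketch of how such a parser might behave. It is not the `parseSseChunk` from this PR: the name `parseSseChunkSketch`, its signature, and the return shape are hypothetical. It assumes frames are separated by a blank line, that only `data:` lines carry payload, and that a payload which fails `JSON.parse` is a legacy plain-text message.

```typescript
// Hypothetical return shape for this sketch.
interface ParsedFrames {
  events: unknown[]; // one entry per complete frame: parsed JSON or a legacy string
  rest: string;      // incomplete tail to pass back in with the next chunk
}

// Hypothetical sketch: combine the carried-over tail with a new chunk and
// extract every complete frame, leaving any partial frame in `rest`.
function parseSseChunkSketch(buffer: string, chunk: string): ParsedFrames {
  const text = (buffer + chunk).replace(/\r\n/g, "\n");
  const parts = text.split("\n\n");
  // The last piece is either empty or an unfinished frame.
  const rest = parts.pop() ?? "";
  const events: unknown[] = [];

  for (const frame of parts) {
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment line, e.g. a keep-alive
      if (line.startsWith("data:")) {
        const value = line.slice(5);
        // A single leading space after the colon is not part of the value.
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
      // Other fields (id, retry, event) are ignored in this sketch.
    }
    if (dataLines.length === 0) continue;
    const payload = dataLines.join("\n");
    try {
      events.push(JSON.parse(payload));
    } catch {
      events.push(payload); // plain-text legacy frame
    }
  }
  return { events, rest };
}
```

A caller would keep `rest` between reads and feed it back as `buffer`, so a frame split across two network chunks is parsed once its terminating blank line arrives. One limitation of the fallback: a legacy message that happens to be valid JSON (for example a bare number) would be returned parsed rather than as a string.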